Enumerating Distinct Decision Trees
نویسنده
چکیده
The search space for the feature selection problem in decision tree learning is the lattice of subsets of the available features. We provide an exact enumeration procedure of the subsets that lead to all and only the distinct decision trees. The procedure can be adopted to prune the search space of complete and heuristics search methods in wrapper models for feature selection. Based on this, we design a computational optimization of the sequential backward elimination heuristics with a performance improvement of up to 100×.
منابع مشابه
Enumerating the junction trees of a decomposable graph.
We derive methods for enumerating the distinct junction tree representations for any given decomposable graph. We discuss the relevance of the method to estimating conditional independence graphs of graphical models and give an algorithm that, given a junction tree, will generate uniformly at random a tree from the set of those that represent the same graph. Programs implementing these methods ...
متن کاملEfficient Discovery of Frequent Unordered Trees
Recently, an algorithm called Freqt was introduced which enumerates all frequent induced subtrees in an ordered data tree. We propose a new algorithm for mining unordered frequent induced subtrees. We show that the complexity of enumerating unordered trees is not higher than the complexity of enumerating ordered trees; a strategy for determining the frequency of unordered trees is introduced.
متن کاملUsing Sparsi cation for Parametric Minimum Spanning Tree Problems
Two applications of sparsiication to parametric computing are given. The rst is a fast algorithm for enumerating all distinct minimum spanning trees in a graph whose edge weights vary linearly with a parameter. The second is an asymptotically optimal algorithm for the minimum ratio spanning tree problem, as well as other search problems, on dense graphs.
متن کامل. ' , Faults Discovery By Using Mined Data
Abstrucr-Fault discovery in the complex systems consist of model based reasoning, fault tree analysis, rule based inference methods, and other approaches. Model based reasoning builds mdeis for the systems either by inathematic formulations or by experiment model. Fault Tree Analysis shows the possible causes of a system malfunction by enumerating the suspect components and their respective fai...
متن کاملEnumerating All Maximal Frequent Subtrees
Given a collection of leaf-labeled trees on a common leafset and a fraction f in (1/2,1], a frequent subtree (FST) is a subtree isomorphically included in at least fraction f of the input trees. The well-known maximum agreement subtree (MAST) problem identifies FST with f = 1 and having the largest number of leaves. Apart from its intrinsic interest from the algorithmic perspective, MAST has pr...
متن کامل